Abstract: Consider the nonlinear regression model

$Y_i = g(\mathbf{x}_i, \theta) + e_i, \quad i = 1, \ldots, n$ (1)

with $\mathbf{x}_i \in \mathbb{R}^k$, $\theta = (\theta_0, \theta_1, \ldots, \theta_p)' \in \Theta$ (compact in $\mathbb{R}^{p+1}$), where
$g(\mathbf{x}, \theta) = \theta_0 + \tilde g(\mathbf{x}, \theta_1, \ldots, \theta_p)$ is continuous, twice differentiable in $\theta$ and monotone in
components of $\theta$. Following Gutenbrunner and Jureckova (1992) and Jureckova
and Prochazka (1994), we introduce regression rank scores for model (1), and
prove their asymptotic properties under some regularity conditions. As an
application, we propose some tests in nonlinear regression models with nuisance
parameters.

1. Introduction 

Consider the nonlinear regression model

(1.1) $Y_i = g(\mathbf{x}_i, \theta) + e_i, \quad i = 1, \ldots, n$

where $\mathbf{Y} = (Y_1, \ldots, Y_n)'$ is a vector of observations, $\mathbf{x}_i \in \mathcal{X} \subset \mathbb{R}^k$, $i = 1, \ldots, n$, are
given vectors, $e_1, \ldots, e_n$ are i.i.d. errors with a positive (but generally unknown)
density $f$ and $\theta = (\theta_0, \theta_1, \ldots, \theta_p)'$ is an unknown parameter. We assume that $\theta_0$
is an intercept, i.e. that $g(\mathbf{x}, \theta) = \theta_0 + \tilde g(\mathbf{x}, \theta_1, \ldots, \theta_p)$. Koenker and Bassett [14]
introduced the $\alpha$-regression quantile and the $\alpha$-trimmed least squares estimator for the
linear regression model. Their idea is very natural and regression quantiles soon
became very popular among applied statisticians and econometricians. Gutenbrunner
and Jureckova [3] showed that the variables dual to regression quantiles in the
parametric linear programming sense extend the rank scores to the linear regression
model; such dual regression quantiles were called the regression rank scores
(RRS). They are invariant to the (linear) regression, and as such are suitable for
construction of tests in the presence of nuisance regression. Such tests were already
considered by Gutenbrunner and Jureckova [3], and then Gutenbrunner et al. [4]
constructed a general class of tests of linear hypotheses based on regression rank
scores. Koul and Saleh [17] extended the regression quantiles and regression rank
scores to the linear autoregressive time series. The autoregression rank scores were
then used for testing by Hallin et al. [7], Hallin et al. [8], Hallin and Jureckova [6],
Kalvova et al. [12], among others.
The basic definition of the $\alpha$-regression quantile can be naturally extended to
other models, including nonlinear regression. Chen [1] used the technique of
linearization of the regression function. Jureckova and Prochazka [10] proved the
consistency and asymptotic normality of regression quantiles in model (1.1), covering
the logistic regression and a mixture of two exponentials as a regression function.
There already exists a rich literature on nonlinear regression quantiles and their
computation. We refer to Koenker and Park [15] and to Koenker [13], where many
other references are cited.

A natural idea is to define the nonlinear regression rank scores as some form of
duals to nonlinear regression quantiles. However, such dual variables do not retain
the advantages of the RRS in the linear model; mainly, they are not invariant to
the (nonlinear) regression. Mukherjee [18], inspired by the dual steps of Koenker
and Park [15] in their interior point algorithm for nonlinear regression quantiles,
proposed the regression rank scores for a nonlinear time series model. His RRS are
not a straightforward extension of the RRS of Gutenbrunner and Jureckova [3], but
under some further regularity conditions their asymptotic behavior is analogous
to that of the linear RRS. Mukherjee [18] proved the asymptotic representations
of the regression rank scores process and of the nonlinear rank statistics. For
the construction of tests in models affected by a nonlinear regression with nuisance
parameters, however, his regression rank scores lack the (even asymptotic) invariance
to this type of regression.

Following Jureckova and Prochazka [10], Koenker and Park [15], Mukherjee [18],
and El-Attar et al. [2], we shall consider a possible version of regression rank scores
in model (1.1). Our ultimate goal is their possible application in testing with
nuisance (nonlinear) regression.

2. Regression rank scores 

We shall work with the model (1.1) under the conditions of Jureckova and 
Prochazka [10], namely: 

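(A.1) The function $g(\mathbf{x}, \theta) : \mathcal{X} \times \Theta \to \mathbb{R}$ has the form

$g(\mathbf{x}, \theta) = \theta_0 + \tilde g(\mathbf{x}, \theta^*), \quad \mathbf{x} \in \mathcal{X}, \quad \theta^* = (\theta_1, \ldots, \theta_p)', \quad (\theta_0, \theta^{*\prime})' \in \Theta$

with some function $\tilde g$ of $\mathbf{x}$ and of $\theta^*$.

We assume that the function $g(\mathbf{x}, \theta)$ is strictly monotone and twice differentiable
in every component of $\theta = (\theta_0, \theta_1, \ldots, \theta_p)'$. The first and second derivatives are
bounded by $K$, $0 < K < \infty$, uniformly in $\mathcal{X}$ and $\Theta$.

(A.2) The parameter space $\Theta$ and the space $\mathcal{X}$ of $\mathbf{x}$ are compact.

(A.3) (Identifiability). Every set $(y_1, \mathbf{x}_1), \ldots, (y_{p+1}, \mathbf{x}_{p+1})$ of $p + 1$ different
points determines uniquely $\theta \in \Theta$ such that

$y_i = g(\mathbf{x}_i, \theta), \quad i = 1, \ldots, p+1.$

(A.4) The errors $e_1, \ldots, e_n$ are independent, identically distributed with a
symmetric, positive and bounded density $f$ that has a bounded derivative $f'$.

(A.5) There exist finite positive constants $k_1$, $k_2$ such that, for $n > n_0$,

$k_1 \|\theta_2 - \theta_1\|^2 \le \frac{1}{n} \sum_{i=1}^n [g(\mathbf{x}_i, \theta_2) - g(\mathbf{x}_i, \theta_1)]^2 \le k_2 \|\theta_2 - \theta_1\|^2$

where $\| \cdot \|$ stands for the Euclidean norm.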
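(A.6) Put

$v_{ij}(\theta) = \left. \frac{\partial g(\mathbf{x}_i, \mathbf{t})}{\partial t_j} \right|_{\mathbf{t} = \theta}, \quad i = 1, \ldots, n, \quad j = 0, 1, \ldots, p,$

and denote $\mathbf{V}_n(\theta) = \mathbf{V}(\theta) = [v_{ij}(\theta)]_{i = 1, \ldots, n; \; j = 0, \ldots, p}$.

We shall assume that

$\lim_{n \to \infty} \mathbf{Q}_n(\theta) = \mathbf{Q}(\theta)$

where $\mathbf{Q}_n(\theta) = \frac{1}{n} \mathbf{V}_n'(\theta) \mathbf{V}_n(\theta)$ and $\mathbf{Q}(\theta)$ is a positive definite matrix of order
$(p + 1) \times (p + 1)$. Moreover, we assume that

(2.1) $\frac{1}{n} \sum_{i=1}^n \|\mathbf{v}_i(\theta)\|^4 = O(1)$ as $n \to \infty$

where $\mathbf{v}_i'(\theta)$ is the $i$-th row of $\mathbf{V}_n(\theta)$, $i = 1, \ldots, n$.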

The motivation for and the validity of conditions (A.1)-(A.6) are discussed by
Jureckova and Prochazka [10]; they correspond to the practical problems
studied by these authors. Conditions (A.3) and (A.6) are also assumed by
Mukherjee [18], with the difference that he replaces (2.1) with
$\max_{1 \le i \le n} \|\mathbf{v}_i(\theta)\| = o(n^{1/2})$. Mukherjee, considering the time series model, assumes
neither the monotonicity of $g$ in the components of $\theta$ nor the positivity of $\mathbf{x}$, but
other conditions suitable for the AR model; these conditions are also interpreted
by Koul [16].

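The regression $\alpha$-quantile $\hat\theta_{n\alpha}$ of model (1.1) is the minimizer of

(2.2) $\sum_{i=1}^n \rho_\alpha(Y_i - g(\mathbf{x}_i, \mathbf{t})) := \min$

with respect to $\mathbf{t} \in \Theta$, where

(2.3) $\rho_\alpha(z) = |z| \{\alpha I[z > 0] + (1 - \alpha) I[z < 0]\}, \quad z \in \mathbb{R}.$

Regarding (2.2), $\hat\theta_{n\alpha}$ can be also defined as a component $\mathbf{t}$ of the solution
$(\mathbf{t}, \mathbf{r}^+, \mathbf{r}^-) \in \mathbb{R}^{p+1} \times \mathbb{R}^n_+ \times \mathbb{R}^n_+$ of the minimization

$\alpha \sum_{i=1}^n r_i^+ + (1 - \alpha) \sum_{i=1}^n r_i^- := \min$

subject to

$r_i^+ - r_i^- = Y_i - g(\mathbf{x}_i, \mathbf{t}), \quad i = 1, \ldots, n.$

We can also write

$\sum_{i=1}^n \rho_\alpha(Y_i - g(\mathbf{x}_i, \hat\theta_{n\alpha})) = \sum_{i=1}^n (Y_i - g(\mathbf{x}_i, \hat\theta_{n\alpha})) \{ I[Y_i > g(\mathbf{x}_i, \hat\theta_{n\alpha})] - (1 - \alpha) \}$

(2.4) $= \sum_{i=1}^n \left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right) \{ I[E_{i\alpha} > g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] - (1 - \alpha) \}$

where

(2.5) $E_{i\alpha} = e_i - F^{-1}(\alpha), \quad i = 1, \ldots, n,$ and

$T_n = \hat\theta_{n\alpha} - \theta_\alpha$ and $\theta_\alpha = \theta + \mathbf{e}_1 F^{-1}(\alpha)$, $\mathbf{e}_1 = (1, 0, \ldots, 0)'$.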
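As a small numerical illustration of the check function (2.3), the following Python sketch verifies that $\rho_\alpha(z)$ equals $z\{I[z > 0] - (1 - \alpha)\}$, which is the form used in (2.4), and minimizes the criterion (2.2) in the simplest special case $g(\mathbf{x}, \theta) = \theta_0$ (a pure location model). The function names and the toy sample are hypothetical, and the search over sample points is valid only for this location case; it is not a general nonlinear algorithm.

```python
def rho(z, alpha):
    """Check function (2.3): |z| * (alpha*I[z>0] + (1-alpha)*I[z<0])."""
    return abs(z) * (alpha * (z > 0) + (1 - alpha) * (z < 0))


def rho_signed(z, alpha):
    """Equivalent form used in (2.4): z * (I[z>0] - (1-alpha))."""
    return z * ((z > 0) - (1 - alpha))


def location_quantile(ys, alpha):
    """Minimize sum_i rho(y_i - t) over t among the sample points.

    Assumption: location model only. The criterion is convex and piecewise
    linear in t, so a minimizer can be found among the observations.
    """
    return min(ys, key=lambda t: sum(rho(y - t, alpha) for y in ys))


ys = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical toy sample
q_median = location_quantile(ys, 0.5)
q_upper = location_quantile(ys, 0.7)
print(q_median, q_upper)
```

In the genuinely nonlinear case the same criterion would be passed to a numerical optimizer or to the interior point algorithm of Koenker and Park [15].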

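By El-Attar, Vidyasagar, and Dutta [2], if $\hat\theta_{n\alpha}$ minimizes (2.2), then there exists a
vector $\mathbf{a}_n(\alpha) \in [0, 1]^n$, $0 < \alpha < 1$, such that

(2.6) $a_{ni}(\alpha) = \begin{cases} 1 & \text{if } Y_i > g(\mathbf{x}_i, \hat\theta_{n\alpha}) \\ 0 & \text{if } Y_i < g(\mathbf{x}_i, \hat\theta_{n\alpha}), \end{cases} \quad i = 1, \ldots, n,$

(2.7) $\sum_{i=1}^n v_{ij}(\hat\theta_{n\alpha}) [a_{ni}(\alpha) - (1 - \alpha)] = 0, \quad j = 0, 1, \ldots, p,$

(2.8) $\frac{1}{n} \sum_{i=1}^n (Y_i - g(\mathbf{x}_i, \hat\theta_{n\alpha})) [a_{ni}(\alpha) - (1 - \alpha)] = \frac{1}{n} \sum_{i=1}^n \rho_\alpha(Y_i - g(\mathbf{x}_i, \hat\theta_{n\alpha})).$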

Remark 2.1. In fact, (2.6) follows from (2.8).

Hence, we can define the regression rank scores $a_{n1}(\alpha), \ldots, a_{nn}(\alpha)$ as one of the
vectors satisfying (2.6)-(2.8); because the set $A$ of such vectors is convex, we can
define $\mathbf{a}_n(\alpha)$ as $\mathbf{a} \in A$ maximizing $\sum_{i=1}^n Y_i a_i$. Unlike Mukherjee [18], we profit from
the equation (2.7). Notice that (2.7), among others, implies

(2.9) $\sum_{i=1}^n a_{ni}(\alpha) = n(1 - \alpha), \quad 0 < \alpha < 1,$

hence, by the continuity,

(2.10) $a_{ni}(0) = 1, \quad a_{ni}(1) = 0, \quad i = 1, \ldots, n.$
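To see how the constraints determine the scores, consider again the location model $g(\mathbf{x}, \theta) = \theta_0$, where the only derivative is $v_{i0} = 1$. The sketch below is a hypothetical illustration, assuming distinct observations and a non-integer $n\alpha$, so that exactly one observation lies on the fitted quantile. The sign condition fixes the scores off the fit at 1 or 0, and the orthogonality condition with $j = 0$, which is the summation identity (2.9), fixes the remaining score.

```python
def location_rank_scores(ys, alpha):
    """Regression rank scores in the location model g(x, theta) = theta_0.

    Assumptions (illustration only): distinct observations and n*alpha not
    an integer, so exactly one observation sits on the fitted quantile.
    """
    n = len(ys)

    def rho(z):
        return z * ((z > 0) - (1 - alpha))

    theta_hat = min(ys, key=lambda t: sum(rho(y - t) for y in ys))
    above = sum(y > theta_hat for y in ys)
    # With v_i0 = 1 the scores must sum to n*(1 - alpha); this pins down
    # the score of the single observation lying exactly on the fit.
    a_tie = n * (1 - alpha) - above
    scores = [1.0 if y > theta_hat else 0.0 if y < theta_hat else a_tie
              for y in ys]
    return theta_hat, scores


ys = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical toy sample
alpha = 0.7
theta_hat, a = location_rank_scores(ys, alpha)
# Both sides of the duality relation (2.8), without the common factor 1/n.
lhs = sum((y - theta_hat) * (ai - (1 - alpha)) for y, ai in zip(ys, a))
rhs = sum((y - theta_hat) * ((y > theta_hat) - (1 - alpha)) for y in ys)
print(theta_hat, a, lhs, rhs)
```

In this special case the resulting scores coincide with the classical Hajek rank scores of the observations.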
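Regarding (2.4), we can rewrite (2.6)-(2.8) in the form

(2.11) $a_{ni}(\alpha) = \begin{cases} 1 & \text{if } E_{i\alpha} > g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta) \\ 0 & \text{if } E_{i\alpha} < g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta), \end{cases} \quad i = 1, \ldots, n,$

(2.12) $\sum_{i=1}^n v_{ij}(\hat\theta_{n\alpha}) [a_{ni}(\alpha) - (1 - \alpha)] = 0, \quad j = 0, 1, \ldots, p,$

(2.13) $\frac{1}{n} \sum_{i=1}^n \left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right) [a_{ni}(\alpha) - (1 - \alpha)] = \frac{1}{n} \sum_{i=1}^n \rho_\alpha\left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right).$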

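Jureckova and Prochazka [10] proved that, under conditions (A.1)-(A.6),

(2.14) $\hat\theta_{n\alpha} - \theta_\alpha = O_p(n^{-1/2})$ as $n \to \infty$,

(2.15) $n^{1/2} (\hat\theta_{n\alpha} - \theta_\alpha) = n^{-1/2} \left( f(F^{-1}(\alpha)) \right)^{-1} \mathbf{Q}_n^{-1}(\theta) \sum_{i=1}^n \mathbf{v}_i(\theta) \psi_\alpha(E_{i\alpha}) + o_p(1)$ as $n \to \infty$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$, $\forall \varepsilon \in (0, \frac{1}{2})$, where

(2.16) $\psi_\alpha(u) = I[u > 0] - (1 - \alpha), \quad u \in \mathbb{R}.$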
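Expanding $g(\mathbf{x}_i, \hat\theta_{n\alpha}) - g(\mathbf{x}_i, \theta_\alpha)$ around $\hat\theta_{n\alpha}$, we obtain for $i = 1, \ldots, n$

$g(\mathbf{x}_i, \hat\theta_{n\alpha}) - g(\mathbf{x}_i, \theta_\alpha) = \hat\theta_{0,n\alpha} - \theta_0 - F^{-1}(\alpha) + \tilde g(\mathbf{x}_i, \hat\theta^*_{n\alpha}) - \tilde g(\mathbf{x}_i, \theta^*)$

(2.17) $= \hat\theta_{0,n\alpha} - \theta_0 - F^{-1}(\alpha) + \sum_{j=1}^p \tilde g'_j(\mathbf{x}_i, \hat\theta^*_{n\alpha}) (\hat\theta_{j,n\alpha} - \theta_j) - (\hat\theta^*_{n\alpha} - \theta^*)' \mathbf{B}_i(\vartheta_n) (\hat\theta^*_{n\alpha} - \theta^*) / 2$

with $\vartheta_n$ between $\theta^*$ and $\hat\theta^*_{n\alpha}$, with $\tilde g'_j$ denoting the partial derivative of $\tilde g$ with respect to $\theta_j$,
and with $\mathbf{B}_i(\vartheta)$ denoting the matrix of second-order partial derivatives of $\tilde g(\mathbf{x}_i, \cdot)$ at $\vartheta$.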

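Inserting (2.17) in (2.13) and regarding (2.11), (2.14) and condition (A.1), we
obtain

(2.18) $\frac{1}{n} \sum_{i=1}^n \left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right) [a_{ni}(\alpha) - (1 - \alpha)] = \frac{1}{n} \sum_{i=1}^n E_{i\alpha} [a_{ni}(\alpha) - (1 - \alpha)] + \alpha(1 - \alpha) \cdot O_p(n^{-1})$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$, $\varepsilon \in (0, \frac{1}{2})$. On the other hand, it follows from Lemma 3.5
in Jureckova and Prochazka [10] and its proof that

(2.19) $\frac{1}{n} \sum_{i=1}^n \rho_\alpha\left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right) = \frac{1}{n} \sum_{i=1}^n \rho_\alpha(E_{i\alpha}) + \alpha(1 - \alpha) \cdot O_p(n^{-1})$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$, $\varepsilon \in (0, \frac{1}{2})$. Combining (2.13), (2.18) and (2.19) entails

(2.20) $\frac{1}{n} \sum_{i=1}^n E_{i\alpha} \left( a_{ni}(\alpha) - I[e_i > F^{-1}(\alpha)] \right) = \alpha(1 - \alpha) \cdot O_p(n^{-1})$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$, $\varepsilon \in (0, \frac{1}{2})$. This, in turn, further implies

(2.21) $\frac{1}{n} \sum_{i=1}^n E_{i\alpha} [a_{ni}(\alpha) - (1 - \alpha)] = \frac{1}{n} \sum_{i=1}^n E_{i\alpha} [a_n^*(R_{ni}, \alpha) - (1 - \alpha)] + \alpha(1 - \alpha) \cdot O_p(n^{-1})$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$, $\varepsilon \in (0, \frac{1}{2})$, where $a_n^*(R_{ni}, \alpha)$ are Hajek's rank scores,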



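(2.22) $a_n^*(R_{ni}, \alpha) = \begin{cases} 0 & \text{if } \frac{R_{ni}}{n} < \alpha, \\ R_{ni} - n\alpha & \text{if } \frac{R_{ni} - 1}{n} \le \alpha \le \frac{R_{ni}}{n}, \\ 1 & \text{if } \alpha < \frac{R_{ni} - 1}{n}, \end{cases}$

and $R_{ni}$ is the rank of $e_i$ among $e_1, \ldots, e_n$, $i = 1, \ldots, n$ (see, e.g., Hajek and Sidak [5],
Section V.3.5).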
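The Hajek rank scores are piecewise linear in $\alpha$: for a given rank $R$ the three-branch definition is the same as clipping $R - n\alpha$ to the interval $[0, 1]$. A minimal Python sketch (the function name and the sample size are illustrative assumptions) that also checks that the scores sum to $n(1 - \alpha)$ and take the values 1 at $\alpha = 0$ and 0 at $\alpha = 1$:

```python
def hajek_score(rank, n, alpha):
    """Hajek rank score: 0 if rank/n < alpha, 1 if alpha < (rank-1)/n,
    and rank - n*alpha in between, written as a clipped linear function."""
    return min(1.0, max(0.0, rank - n * alpha))


n = 5  # hypothetical sample size
for alpha in (0.0, 0.25, 0.5, 0.9, 1.0):
    row = [hajek_score(r, n, alpha) for r in range(1, n + 1)]
    print(alpha, row, sum(row))
```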

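Consider a triangular array $\{\mathbf{z}_{n1}, \ldots, \mathbf{z}_{nn}\}$ of vectors from $\mathbb{R}^r$ such that

(Z.1) $\sum_{i=1}^n \mathbf{z}_{ni} = \mathbf{0}$ and $\frac{1}{n} \sum_{i=1}^n \mathbf{z}_{ni} \mathbf{z}_{ni}' \to \mathbf{C}$ as $n \to \infty$, where $\mathbf{C}$ is a positive definite
$r \times r$ matrix,

(Z.2) $\max_{1 \le i \le n} \|\mathbf{z}_{ni}\| = o(n^{1/2})$,

and the nonlinear rank statistic process

(2.23) $\mathbf{Z}_n(\alpha) = \frac{1}{n} \sum_{i=1}^n \mathbf{z}_{ni} [a_{ni}(\alpha) - (1 - \alpha)], \quad 0 < \alpha < 1.$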

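Let $\varphi : (0, 1) \to \mathbb{R}$ be a monotone function. Fix $\varepsilon \in (0, \frac{1}{2})$ and put

(2.24) $\varphi_\varepsilon(u) = \begin{cases} \varphi(\varepsilon) & \text{if } 0 < u < \varepsilon, \\ \varphi(u) & \text{if } \varepsilon \le u \le 1 - \varepsilon, \\ \varphi(1 - \varepsilon) & \text{if } 1 - \varepsilon < u < 1, \end{cases}$

and define the scores

(2.25) $b_{ni} = \int_0^1 a_{ni}(\alpha) \, d\varphi_\varepsilon(\alpha), \quad i = 1, \ldots, n.$

For example, $\varphi(u) = u - \frac{1}{2}$ (Wilcoxon score function) or

$\varphi(u) = \begin{cases} -1 & \text{if } 0 < u < \frac{1}{2}, \\ 0 & \text{if } u = \frac{1}{2}, \\ 1 & \text{if } \frac{1}{2} < u < 1 \end{cases}$

(median score function). The vector of nonlinear rank statistics

(2.26) $\mathbf{S}_n = n^{-1/2} \sum_{i=1}^n \mathbf{z}_{ni} b_{ni}$

will serve us as a basis for a construction of tests.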
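A short numerical sketch may help to see what the scores look like. It assumes that (2.25) is read as the integral of the rank scores against $d\varphi_\varepsilon(\alpha)$ and, purely for illustration, replaces the regression rank scores by the Hajek rank scores (2.22); the Wilcoxon function $\varphi(u) = u - \frac{1}{2}$ is used, so that $d\varphi_\varepsilon(\alpha) = d\alpha$ on $[\varepsilon, 1 - \varepsilon]$ and zero elsewhere. All function names are hypothetical.

```python
def hajek_score(rank, n, alpha):
    """Hajek rank score (2.22) as a clipped linear function of alpha."""
    return min(1.0, max(0.0, rank - n * alpha))


def wilcoxon_score(rank, n, eps, steps=20000):
    """Midpoint-rule approximation of the integral of the Hajek score
    over [eps, 1 - eps], i.e. the score for phi(u) = u - 1/2 truncated
    at eps (phi_eps is constant outside that interval)."""
    h = (1.0 - 2.0 * eps) / steps
    return h * sum(hajek_score(rank, n, eps + (k + 0.5) * h)
                   for k in range(steps))


n = 5  # hypothetical sample size
b_untruncated = [wilcoxon_score(r, n, 0.0) for r in range(1, n + 1)]
b_truncated = [wilcoxon_score(r, n, 0.1) for r in range(1, n + 1)]
print(b_untruncated)
print(b_truncated)
```

Without truncation the integral equals $(R - \frac{1}{2})/n$, which is the familiar Wilcoxon score up to an additive constant; the constant does not affect (2.26) because the vectors $\mathbf{z}_{ni}$ sum to zero by (Z.1).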

Remark 2.2. Gutenbrunner et al. [4] and Hallin and Jureckova [6] considered the
scores of type (2.25) with $\varphi_\varepsilon$ replaced by a nondecreasing, square-integrable score
function $\varphi : (0, 1) \to \mathbb{R}$, satisfying conditions of Chernoff-Savage type, including
the normal scores. However, they were not able to construct the tests with
nuisance linear regression for $f$ with heavy tails. Jureckova [9] admitted heavy-tailed
distributions, but her scores were truncated to $(\varepsilon_n, 1 - \varepsilon_n)$ with $\varepsilon_n \downarrow 0$. Under the
conditions (A.1)-(A.3) and using the methods of Jureckova and Prochazka [10],
we are able to guarantee the uniformity in (2.15) only on $[\varepsilon, 1 - \varepsilon]$ with a fixed
$\varepsilon \in (0, \frac{1}{2})$. On the other hand, we do not restrict the tails of the distribution.

A possible extension of the subinterval $[\varepsilon, 1 - \varepsilon]$ to $[\varepsilon_n, 1 - \varepsilon_n]$ or to $(0, 1)$ will be
an object of a further study.

3. Properties of nonlinear rank statistics 



We shall first prove the lemma. 
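Lemma 3.1. Let $\{\mathbf{z}_{n1}, \ldots, \mathbf{z}_{nn}\}$ be a triangular array of vectors from $\mathbb{R}^r$ satisfying
(Z.1) and (Z.2), let

$\mathbf{Z}_n = [\mathbf{z}_{n1}, \ldots, \mathbf{z}_{nn}]'$.

Then, under conditions (A.1)-(A.3),

(3.1) $\sup_{\varepsilon \le \alpha \le 1 - \varepsilon} \left\{ n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} \psi_\alpha\left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + T_n) - g(\mathbf{x}_i, \theta)] \right) - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \psi_\alpha(E_{i\alpha}) \right] \right\| \right\} \stackrel{p}{\longrightarrow} 0$

as $n \to \infty$, for any fixed $\varepsilon \in (0, \frac{1}{2})$, where $\hat{\mathbf{z}}_{ni}'$ is the $i$-th row of the projection $\hat{\mathbf{Z}}_n$
of $\mathbf{Z}_n$ in the space spanned by the columns of matrix $\mathbf{V}_n(\theta)$, i.e.

(3.2) $\hat{\mathbf{Z}}_n = \mathbf{H}_n(\theta) \mathbf{Z}_n, \quad \mathbf{H}_n(\theta) = \mathbf{V}_n(\theta) [\mathbf{V}_n'(\theta) \mathbf{V}_n(\theta)]^{-1} \mathbf{V}_n'(\theta).$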

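Proof. It follows from Lemma 3.5 in Jureckova and Prochazka [10] that

(3.3) $\frac{1}{n} \sum_{i=1}^n \left[ \rho_\alpha\left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + \mathbf{t}) - g(\mathbf{x}_i, \theta)] \right) - \rho_\alpha(E_{i\alpha}) \right] = \frac{1}{2} f(F^{-1}(\alpha)) \, \mathbf{t}' \mathbf{Q}_n \mathbf{t}$

(3.4) $\quad - \mathbf{t}' \frac{1}{n} \sum_{i=1}^n \mathbf{v}_i(\theta) \psi_\alpha(E_{i\alpha}) + O_p\left( n^{-1/2} \|\mathbf{t}\|^{3/2} \right)$

uniformly in $\varepsilon \le \alpha \le 1 - \varepsilon$ and in $\|\mathbf{t}\| \le r_n$, for every sequence $\{r_n\}$ of positive
numbers tending to 0. Now, extend the model (1.1) in the following way:

(3.5) $Y_i = g(\mathbf{x}_i, \theta) + \mathbf{z}_{ni}' \vartheta + e_i, \quad i = 1, \ldots, n.$

Then the conditions (A.1)-(A.3) are satisfied even for the extended model (3.5),
replacing $\theta$ by $(\theta', \vartheta')'$. The function $\rho_\alpha$ is absolutely continuous and convex; taking
the right derivative of (3.3) with respect to the last $r$ coordinates of $\mathbf{t}$ (evaluated when
the last $r$ coordinates of $\mathbf{t}$ are zero), we obtain

(3.6) $\sup_{\varepsilon \le \alpha \le 1 - \varepsilon} \left\{ n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} \psi_\alpha\left( E_{i\alpha} - [g(\mathbf{x}_i, \theta + \mathbf{t}) - g(\mathbf{x}_i, \theta)] \right) - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \psi_\alpha(E_{i\alpha}) \right] \right\| \right\} = o_p(1)$

uniformly in $\mathbf{t} \in \mathbb{R}^{p+1}$, $\|\mathbf{t}\| \le r_n$, for every sequence $\{r_n\}$ of positive numbers
tending to 0. Inserting $\mathbf{t} \mapsto \hat\theta_{n\alpha} - \theta_\alpha$ into (3.6), we arrive at (3.1).

□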
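The projection of $\mathbf{Z}_n$ on the column space of $\mathbf{V}_n(\theta)$ that appears in Lemma 3.1 can be computed without forming the hat matrix explicitly. The following pure-Python sketch handles one column of $\mathbf{Z}_n$ at a time (so $r = 1$) and uses classical Gram-Schmidt orthogonalization; the helper name and the toy matrices are hypothetical, and the columns of $\mathbf{V}$ are assumed linearly independent, in the spirit of condition (A.6).

```python
def project_onto_columns(V, z):
    """Project the vector z onto the column space of V (a list of rows).

    Assumption: the columns of V are linearly independent. Classical
    Gram-Schmidt is used here for brevity; a QR or least-squares routine
    would be the numerically safer choice for real data.
    """
    n, m = len(V), len(V[0])
    basis = []
    for j in range(m):
        col = [V[i][j] for i in range(n)]
        for q in basis:
            c = sum(a * b for a, b in zip(col, q))
            col = [a - c * b for a, b in zip(col, q)]
        norm = sum(a * a for a in col) ** 0.5
        basis.append([a / norm for a in col])
    z_hat = [0.0] * n
    for q in basis:
        c = sum(a * b for a, b in zip(z, q))
        z_hat = [zh + c * b for zh, b in zip(z_hat, q)]
    return z_hat


# Hypothetical derivative matrix with an intercept column and one slope column.
V = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
z = [-0.5, 0.5, -0.5, 0.5]          # centered regressor, sums to zero
z_hat = project_onto_columns(V, z)
resid = [a - b for a, b in zip(z, z_hat)]
print(z_hat, resid)
```

The residuals `resid` play the role of $\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}$; by construction they are orthogonal to every column of `V`.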



The following corollary approximates the regression rank scores by an empirical 
process. 

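Corollary 3.1. Under the conditions of Lemma 3.1,

(3.7) $\sup_{\varepsilon \le \alpha \le 1 - \varepsilon} \left\{ n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} a_{ni}(\alpha) - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) I[e_i > F^{-1}(\alpha)] \right] \right\| \right\} = o_p(1)$

for any fixed $\varepsilon \in (0, \frac{1}{2})$.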
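Proof. Notice that

$\sup_{\varepsilon \le \alpha \le 1 - \varepsilon} \left\{ n^{-1/2} \left\| \sum_{i=1}^n \mathbf{z}_{ni} I[Y_i = g(\mathbf{x}_i, \hat\theta_{n\alpha})] \right\| \right\} = o_p(1);$

hence, regarding (2.6), (3.7) follows from Lemma 3.1.

□

Corollary 3.2. Under the conditions of Lemma 3.1,

(3.8) $\sup_{\varepsilon \le \alpha \le 1 - \varepsilon} \left\{ n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} a_{ni}(\alpha) - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) a_n^*(R_{ni}, \alpha) \right] \right\| \right\} = o_p(1)$

as $n \to \infty$, for any fixed $\varepsilon \in (0, \frac{1}{2})$, where $a_n^*(R_{ni}, \alpha)$ are Hajek's scores defined in
(2.22).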

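Let $\varphi : (0, 1) \to \mathbb{R}$ be a monotone function, fix $\varepsilon \in (0, \frac{1}{2})$ and consider the scores
(2.25) and the nonlinear rank statistics (2.26). Then Corollaries 3.1, 3.2 imply

Corollary 3.3. Under the conditions of Lemma 3.1, for any fixed $\varepsilon \in (0, \frac{1}{2})$,

$n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} b_{ni} - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \int_0^1 I[e_i > F^{-1}(\alpha)] \, d\varphi_\varepsilon(\alpha) \right] \right\| = o_p(1),$

(3.9) $n^{-1/2} \left\| \sum_{i=1}^n \left[ \mathbf{z}_{ni} b_{ni} - (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \tilde b_{ni} \right] \right\| = o_p(1),$

where

$\tilde b_{ni} = \int_0^1 a_n^*(R_{ni}, \alpha) \, d\varphi_\varepsilon(\alpha), \quad i = 1, \ldots, n.$

Hence, under the model (1.1), the nonlinear rank statistic (2.26) is asymptotically
equivalent to the linear rank statistic $n^{-1/2} \sum_{i=1}^n (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \tilde b_{ni}$ pertaining to the
Hajek rank scores.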



4. Application: Tests of linear regression in nonlinear regression model 
with unknown parameters 

Convenient properties of the nonlinear regression rank scores, proved in the former 
sections, lead to an idea of their possible application in testing the significance of a 
linear regression in the presence of a nonlinear regression with nuisance parameters, 
or in testing other hypotheses with nuisance parameters of nonlinear regression. For 
instance, we can compare two sets of observations affected by a nonlinear regression 
with unknown parameters. 

Let us illustrate possible applications on the nonlinear regression model

(4.1) $Y_i = g(\mathbf{x}_i, \theta) + \mathbf{z}_{ni}' \beta + e_i, \quad i = 1, \ldots, n$

where $\beta \in \mathbb{R}^r$ is an unknown parameter and $\mathbf{z}_{ni}$, $i = 1, \ldots, n$, are known (or
observable) regressors. Denote by

$\mathbf{Z}_n = [\mathbf{z}_{n1}, \ldots, \mathbf{z}_{nn}]'$

the matrix of order $n \times r$ and assume that it has rank $r$. The problem is that of
testing the hypothesis

$H_0 : \beta = 0$

with $g(\cdot, \cdot)$ of known shape and with known $\mathbf{x}_i$, $i = 1, \ldots, n$, but with $\theta$ and $F$
unspecified; we only assume that conditions (A.1)-(A.3) and (Z.1)-(Z.2) are
satisfied.

As an example, consider the situation where the nonlinear regression describes the
concentration of the drug theophylline in human blood, measured at times $t_1, \ldots, t_k$
after the application. In such a situation, the regression function is typically a mixture
of exponentials with unknown parameters (see Jureckova and Prochazka [10] and
Schindler [19]). Treating two groups of patients (boys and girls), we want to compare
their reactions to the drug. This leads to a two-sample problem with a nuisance
nonlinear regression.
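In such a two-sample setting a natural choice of the regressor is a centered group indicator, which satisfies the first part of (Z.1) automatically. The sketch below is only an illustration with hypothetical group labels; the coding of the groups is an assumption and is not taken from the data analyzed by Schindler [19].

```python
def centered_group_indicator(groups):
    """Return z_i = 1[group_i == 1] - (share of group 1), so sum_i z_i = 0."""
    n = len(groups)
    share = sum(groups) / n
    return [g - share for g in groups]


groups = [0, 0, 0, 1, 1, 1, 1, 1]      # hypothetical labels: 0 = boys, 1 = girls
z = centered_group_indicator(groups)
c_n = sum(v * v for v in z) / len(z)   # scalar version of (1/n) * sum z_i z_i'
print(z, c_n)
```

Here `c_n` equals the product of the two group shares, so it stays positive as long as both groups keep a non-vanishing share of the sample, and the regressors are bounded, so (Z.2) holds trivially.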

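Let $a_{n1}(\alpha), \ldots, a_{nn}(\alpha)$, $0 < \alpha < 1$, denote the regression rank scores
corresponding to the submodel under $H_0$,

(4.2) $Y_i = g(\mathbf{x}_i, \theta) + e_i, \quad i = 1, \ldots, n.$

Let $\varphi : (0, 1) \to \mathbb{R}$ be a nondecreasing, bounded score function such that $\varphi(1 - u) =
-\varphi(u)$, $0 < u < 1$. Fix $\varepsilon \in (0, \frac{1}{2})$, calculate the scores $b_{ni}$, $i = 1, \ldots, n$, defined in
(2.25), and the test criterion

(4.3) $T_n = (A(\varphi_\varepsilon))^{-2} \mathbf{S}_n' \mathbf{D}_n^{-1} \mathbf{S}_n, \quad \mathbf{S}_n = n^{-1/2} \sum_{i=1}^n \mathbf{z}_{ni} b_{ni},$

where

$A^2(\varphi_\varepsilon) = \int_0^1 (\varphi_\varepsilon(u) - \bar\varphi_\varepsilon)^2 \, du, \quad \bar\varphi_\varepsilon = \int_0^1 \varphi_\varepsilon(u) \, du$

and

$\mathbf{D}_n = \frac{1}{n} (\mathbf{Z}_n - \hat{\mathbf{Z}}_n)' (\mathbf{Z}_n - \hat{\mathbf{Z}}_n),$

and $\hat{\mathbf{Z}}_n$ is the projection of $\mathbf{Z}_n$ in the space spanned by the columns of matrix
$\mathbf{V}_n(\hat\theta_n)$ with $\hat\theta_n$ being the nonlinear regression quantile of model (4.2) with $\alpha = \frac{1}{2}$,
i.e.

$\hat{\mathbf{Z}}_n = \mathbf{H}_n(\hat\theta_n) \mathbf{Z}_n,$

$\mathbf{H}_n(\hat\theta_n) = \mathbf{V}_n(\hat\theta_n) \left( \mathbf{V}_n'(\hat\theta_n) \mathbf{V}_n(\hat\theta_n) \right)^{-1} \mathbf{V}_n'(\hat\theta_n).$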

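The test is based on the asymptotic distribution of $T_n$ under $H_0$. Regarding that

$\mathbf{V}_n(\hat\theta_n) - \mathbf{V}_n(\theta) \stackrel{p}{\longrightarrow} 0$ as $n \to \infty$,

the asymptotic distribution of $T_n$ under $H_0$, due to the Corollary 3.1, in turn
coincides with the asymptotic distribution of the sequence

$(A(\varphi_\varepsilon))^{-2} \mathbf{S}_n^{*\prime} \mathbf{D}_n^{-1} \mathbf{S}_n^*, \quad \mathbf{S}_n^* = n^{-1/2} \sum_{i=1}^n (\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}) \, \varphi_\varepsilon\left( \frac{R_{ni}}{n+1} \right),$

where $R_{ni}$ is the rank of $e_i$, $i = 1, \ldots, n$. Hence, the test of $H_0$ in the presence of
nuisance nonlinear regression would coincide with the ordinary rank test under $\theta$
known, with score generating function $\varphi_\varepsilon$ and with the coefficients $\mathbf{z}_{ni} - \hat{\mathbf{z}}_{ni}$, $i =
1, \ldots, n$. The asymptotic null distribution of $T_n$ under $H_0$ is the central $\chi^2$ with $r$
degrees of freedom. The asymptotic distribution of $T_n$ under the local alternative

$H_n : \beta_n = n^{-1/2} \beta_0$ ($\beta_0 \in \mathbb{R}^r$ fixed)

is the noncentral $\chi^2$ with $r$ degrees of freedom and with noncentrality parameter

$\eta^2 = \beta_0' \mathbf{D} \beta_0 \cdot (A(\varphi_\varepsilon))^{-2} \left[ \int_0^1 \varphi_\varepsilon(u) \, df(F^{-1}(u)) \right]^2, \quad \mathbf{D} = \lim_{n \to \infty} \mathbf{D}_n.$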

If $\beta \in \mathbb{R}^1$, i.e. $r = 1$, we can test $H_0 : \beta = 0$ also against the one-sided alternative
$H_1 : \beta > 0$, which is of most interest in the two-sample model. Then the test
criterion simplifies to

$T_n^* = (A(\varphi_\varepsilon))^{-1} D_n^{-1/2} S_n$

and rejects $H_0$ in favor of $H_1$ at the asymptotic significance level $\tau$ provided $T_n^*$
exceeds the $(1 - \tau)$-quantile of the standard normal distribution, i.e. $T_n^* > \Phi^{-1}(1 - \tau)$.
The performance of such a test on real data is illustrated by Schindler [19],
who has also elaborated a suitable computation algorithm. Neither the Wilcoxon
nor the median test indicates a difference in the dynamics of the spread of theophylline
between the two groups of patients.
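The one-sided decision rule can be sketched in a few lines of Python. The sketch assumes that, for $r = 1$, the one-sided criterion is the signed square root of $T_n$ in (4.3), i.e. $S_n$ divided by $A(\varphi_\varepsilon) D_n^{1/2}$, and that the truncated Wilcoxon score function $\varphi(u) = u - \frac{1}{2}$ is used, for which $A^2(\varphi_\varepsilon)$ has a closed form. The scores and regressors in the example are made-up numbers, not output of an actual regression rank score computation.

```python
from statistics import NormalDist


def a_squared_wilcoxon(eps):
    """A^2(phi_eps) for phi(u) = u - 1/2 truncated to [eps, 1 - eps].

    The mean of phi_eps is 0 by symmetry; the two flat parts contribute
    2*eps*h^2 and the linear middle part 2*h^3/3, with h = 1/2 - eps.
    """
    h = 0.5 - eps
    return 2.0 * eps * h * h + 2.0 * h ** 3 / 3.0


def one_sided_test(z, z_resid, b, eps, tau=0.05):
    """Return (T*, reject) for r = 1.

    z       : regressors z_ni used in S_n
    z_resid : residuals z_ni - z_hat_ni used in D_n
    b       : scores b_ni
    """
    n = len(b)
    s_n = sum(zi * bi for zi, bi in zip(z, b)) / n ** 0.5
    d_n = sum(zr * zr for zr in z_resid) / n
    t_star = s_n / (a_squared_wilcoxon(eps) * d_n) ** 0.5
    return t_star, t_star > NormalDist().inv_cdf(1.0 - tau)


# Hypothetical inputs for illustration only.
z = [-0.5, -0.5, 0.5, 0.5]
b = [-0.3, -0.1, 0.1, 0.3]
t_star, reject = one_sided_test(z, z, b, eps=0.1)
print(t_star, reject)
```

For $\varepsilon = 0$ the function `a_squared_wilcoxon` returns $1/12$, the variance of the uniform distribution on $(0, 1)$, as expected for the untruncated Wilcoxon scores.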

A general class of tests using the regression rank scores, including their numerical 
behavior and the algorithms, is a subject of Schindler's PhD Thesis (Schindler [20]).
Besides the multiple comparisons, we can also test for the independence of two 
random variables, affected by a nonlinear regression with unknown parameters, or 
test for the significance of an outward trend in a similar situation, and we have 
various other applications. 

Acknowledgment. I would like to thank Pranab Kumar Sen for all our cooper- 
ation, which has dated since 1980. I really learned much from him, and I am still 
learning. 